Exactly Solvable Model for Helix-Coil-Sheet Transitions in Protein Systems 
In view of the important role helix-sheet transitions play in protein aggregation, we introduce 
a simple model to study secondary structural transitions of helix-coil-sheet systems using a Potts 
model starting with an effective Hamiltonian. This energy function depends on four parameters 
that approximately describe entropic and enthalpic contributions to the stability of a polypeptide 
in helical and sheet conformations. The sheet structures involve long-range interactions between 
residues which are far in sequence, but are in contact in real space. Such contacts are included 
in the Hamiltonian. Using standard statistical mechanical techniques, the partition function is 
solved exactly using transfer matrices. Based on this model, we study thermodynamic properties of 
polypeptides, including phase transitions between helix, sheet, and coil structures. 

PACS numbers: 87.15.Cc, 87.15.A-, 64.60.De 



In the late 1950s and early 1960s, Zimm and Bragg (ZB) and Lifson and Roig (LR)
studied helix-coil transitions of simple models of homopolypeptides by employing
rigorous statistical methods based on partition functions and transfer matrices [1].
In the 1970s and 1980s, these models were extended to include copolymers and
medium-ranged interactions, and were used to characterize the experimental results
of all amino acids and many proteins [2]. Because of the close coupling between the
theoretical and experimental studies, ZB, LR, and related models have stimulated
much interest in helix-coil
transitions [3], which is still an active field of research up
to the present time [4, 5]. For reviews, see Ref. [6].

However, conformational changes of polypeptides involving sheet structures, such as
helix-sheet transitions, are not as well characterized as helix-coil transitions.
In the late 1970s, using a multi-state model, Tanaka and Scheraga [7] considered
extended and chain-reversal states in addition to helix-coil transitions. In Ref. [7],
medium-range interactions were taken into account to study helices, extended
structures, and coils. More recently, Mattice and Scheraga [8], Sun and Doig [9],
Hong and Lei [10], and others have included sheet structures in statistical models
for homo-polypeptides. The difficulties in constructing models for sheets lie
primarily in the interactions between residues that are long-range in sequence but
close in physical space, and in the rich variety of structures associated with
sheets, turns, and loops, which require a large number of parameters for their
description. In this article, we introduce a simple statistical mechanical model for
helix-coil-sheet transitions of homo-polypeptides, starting with an effective
Hamiltonian. Instead of an Ising-like model, the treatment is built on a multi-state
Potts model, which is capable of explicitly describing some of the long-range
interactions exhibited by sheet structures. The objective is for this simple model
to extend the helix-coil treatments to protein systems with three or more secondary
structures.

An important step in a statistical mechanical approach such as the ZB, LR, Ising,
and Potts models is to construct the partition function for the system, from which
all thermodynamic properties are obtainable. As in the ZB and LR models, partition
functions factorize in terms of transfer matrices. However, the ZB and LR theories
start with a combinatorial partition function without defining an effective
Hamiltonian. More generally, if an energy function $H(\mathbf{i})$ is defined, where
$\mathbf{i} = (i_1, \ldots, i_N)$ and $i_n$ is the micro-state of the $n$th residue,
which can occupy one of $q$ possible states (conformations) labeled
$\{1, 2, \ldots, q\}$, the partition function for a system of $N$ residues with
periodic boundary conditions reduces to

$$ Z = \sum_{\mathbf{i}} e^{-\beta H(\mathbf{i})} = \mathrm{Tr}\, T^N, \qquad (1) $$

where $\beta = (k_B T)^{-1}$, $k_B$ is Boltzmann's constant, and Tr is the matrix
trace operation. The dimension of a transfer matrix in a one-dimensional (1D) Ising
model is $2 \times 2$, and for a $q$-state Potts model it is $q \times q$. For Potts
models with long-range interactions of range $L$ along a 1D chain, as Glumac and
Uzelac [11] showed in their formulation, the dimension of a transfer matrix becomes
$q^L \times q^L$. Eq. (1) may be further simplified by diagonalizing the transfer
matrix $T$.
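
As a minimal sketch of how the trace formula of Eq. (1) can be checked numerically, the Python snippet below builds the $q \times q$ transfer matrix of a nearest-neighbor $q$-state Potts chain and compares $\mathrm{Tr}\, T^N$ with a brute-force sum over all $q^N$ configurations. The reduced coupling `K`, the function names, and the values of `q` and `N` are illustrative assumptions and are not taken from the model of this article.

```python
import itertools
import math

def potts_transfer_matrix(q, K):
    """T[a][b] = exp(K * delta(a, b)) for a reduced nearest-neighbor coupling K."""
    return [[math.exp(K) if a == b else 1.0 for b in range(q)] for a in range(q)]

def mat_mul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_of_power(T, N):
    """Tr(T^N), computed by repeated multiplication."""
    P = T
    for _ in range(N - 1):
        P = mat_mul(P, T)
    return sum(P[i][i] for i in range(len(P)))

def brute_force_Z(q, K, N):
    """Sum of Boltzmann weights over all q**N states of a periodic chain."""
    Z = 0.0
    for conf in itertools.product(range(q), repeat=N):
        bonds = sum(conf[n] == conf[(n + 1) % N] for n in range(N))
        Z += math.exp(K * bonds)
    return Z

q, K, N = 3, 0.7, 6  # illustrative values only
Z_tm = trace_of_power(potts_transfer_matrix(q, K), N)
Z_bf = brute_force_Z(q, K, N)
# Diagonalized form: eigenvalues are e^K + q - 1 (once) and e^K - 1 (q - 1 times).
Z_closed = (math.exp(K) + q - 1) ** N + (q - 1) * (math.exp(K) - 1) ** N
print(Z_tm, Z_bf, Z_closed)
```

The three numbers agree to rounding error. The closed form illustrates the remark about diagonalization: once the eigenvalues of $T$ are known, the trace reduces to a sum of their $N$th powers.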
More recently, Hamiltonians of polypeptide chains have been described using a
variety of Ising-like models [4, 12, 13] and Potts models [14, 15], and also using an
ab initio model [5]. In particular, the WSME model [12, 13] uses two terms to
construct an effective Hamiltonian and partition function: (1) a free energy term
associated with the entropic cost of forming a pair of native residue conformations
with restricted dihedral angles and (2) an enthalpic term associated with
solvent-mediated contact energies between residues. Thus, residues may be either
native or denatured, but this description is not specific enough to distinguish
sheets from helices. Our approach to polypeptides is based on a Potts model, where
residues can assume many conformations including sheet, helix, coil, and turn.
Before discussing the full helix-coil-sheet system, let us consider the simpler case
of helix-coil transitions, where an effective ($q = 2$) Potts Hamiltonian (in reality
a free energy) can be written for a protein consisting of $N$ residues as

where we assign $i_n = 1$ to a residue in the helix conformation and $i_n = 2$ to a
residue in the coil conformation. The subscript 'hc' in $-\beta H_{hc}$ means
'helix-coil', and the '1' in $h_1$ and $J_1$ refers to helix. The meanings of these
parameters are similar to those described in the WSME model, where $h_1 < 0$ refers
to the entropic cost of converting a coil residue to a helical residue, and
$J_1 > 0$ refers to a contact energy between residues. In the present article,
contact energies $J_i$ are free energies associated with solvent-mediated
interactions, including hydrogen bonds, van der Waals interactions, polar
interactions, etc. The Kronecker delta $\delta(1, i_n)$ yields one if the $n$th
residue is helical, and zero otherwise. In the second term of Eq. (2), the parameter
$k$ determines the range of interaction. In $\alpha$-helices, where $k$ equals 3,
residues at positions $n-3$, $n-2$, and $n-1$ are all helical when an H-bond forms
between the $(n-4)$th and $n$th residues. Additionally, the $(n-4)$th and $n$th
residues are not required to have the same conformation; in fact they could be in
any conformation. When $k = 1$, the effective Hamiltonian becomes

$$ -\beta H_{hc} = h_1 \sum_{n=1}^{N} \delta(1, i_n) + J_1 \sum_{n=2}^{N} \delta(1, i_{n-1})\, \delta(i_{n-1}, i_n). $$

The second term in Eq. (2) is also similar to the Hamiltonian of the GMPC model,
which is a microscopic theory for helix-coil transitions based on a $q$-state Potts
model [16].
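
The $k = 1$ helix-coil Hamiltonian lends itself to a short numerical check. The sketch below rests on several assumptions: an open chain, $h_1$ and $J_1$ treated as dimensionless reduced parameters, a $J_1$ term that contributes whenever residues $n-1$ and $n$ are both helical, and hypothetical function names. It propagates a two-component weight vector along the chain (a transfer-matrix recursion), compares the result with direct enumeration, and estimates the helix fraction from the identity $\theta_1 = N^{-1}\, \partial \ln Z / \partial h_1$, which follows from the form of the first term.

```python
import itertools
import math

HELIX, COIL = 1, 2
STATES = (HELIX, COIL)

def log_Z_transfer(h1, J1, N):
    """ln Z for an open chain via a transfer-matrix recursion."""
    # Residue 1 carries only the field term h1 * delta(1, i_1).
    vec = {s: math.exp(h1 * (s == HELIX)) for s in STATES}
    for _ in range(N - 1):
        vec = {
            b: sum(vec[a] * math.exp(h1 * (b == HELIX)
                                     + J1 * (a == HELIX and b == HELIX))
                   for a in STATES)
            for b in STATES
        }
    return math.log(sum(vec.values()))

def log_Z_brute(h1, J1, N):
    """ln Z by explicit enumeration of all 2**N conformations."""
    Z = 0.0
    for conf in itertools.product(STATES, repeat=N):
        n_helix = sum(s == HELIX for s in conf)
        n_pairs = sum(conf[n - 1] == HELIX and conf[n] == HELIX
                      for n in range(1, N))
        Z += math.exp(h1 * n_helix + J1 * n_pairs)
    return math.log(Z)

def helix_fraction(h1, J1, N, eps=1e-6):
    """Average helical content from a central difference of ln Z in h1."""
    return (log_Z_transfer(h1 + eps, J1, N)
            - log_Z_transfer(h1 - eps, J1, N)) / (2 * eps * N)

h1, J1, N = -1.0, 1.5, 8  # illustrative reduced values, not fitted parameters
print(log_Z_transfer(h1, J1, N), log_Z_brute(h1, J1, N))
print(helix_fraction(h1, J1, N))
```

With $J_1 = 0$ the residues decouple and the helix fraction reduces to $e^{h_1}/(1 + e^{h_1})$, which gives a simple sanity check on the recursion.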

To write down an effective Hamiltonian suitable for $\beta$-sheets, we need to
include in it interactions up to length $L$ along the polypeptide chain. Such a
Hamiltonian can be constructed by adapting the long-range spin model of Glumac and
Uzelac [11]. For a chain of $N$ spins, their Hamiltonian can be written as

where $K_l$ is distance-dependent. Fig. 1(a) illustrates a graphical representation
of the $L = 3$ case and facilitates the construction of transfer matrices for
long-range Potts systems. For Potts systems on a 1D lattice, Glumac and Uzelac
grouped the spins along a chain into columns of height $L$, the longest interaction
length, transforming a long-range problem of spin interactions into a short-range
one relating nearest-neighbor columns of height $L$ [11, 17], as illustrated in
Fig. 1(b). Each column of spins represents a vector that can take on one of $q^L$
possible states. The transfer matrix thus has dimension $q^L \times q^L$. The
various lines in Fig. 1 represent the interactions $K_1, K_2, \ldots, K_L$ in
Eq. (3), and contribute to the partition function when the arguments in the
Kronecker deltas are equal.

FIG. 1: (Color online) (a) Graphical representation of the partition function for
the case $L = 3$. The black dots mark the locations of particles along the chain.
The dotted (blue) lines, $K_1$, are nearest-neighbor interactions. The dashed
(green) lines, $K_2$, are next-nearest-neighbor interactions. The solid (red) lines,
$K_3$, are the $L = 3$ interactions. (b) Graphical representation of the transfer
matrix $T$.
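
The column-grouping idea can be illustrated with a small numerical experiment. The sketch below is not the Glumac-Uzelac implementation; it is a minimal construction under stated assumptions: dimensionless couplings `K[l-1]` standing for $K_l$, periodic boundary conditions, a chain of $N = ML$ spins split into $M$ columns of height $L$, and hypothetical function names. Each matrix element collects the bonds inside one column plus the bonds reaching into the next column, and $\mathrm{Tr}\, T^M$ is compared with direct enumeration.

```python
import itertools
import math

def column_transfer_matrix(q, K):
    """q**L x q**L matrix linking neighboring columns of height L = len(K)."""
    L = len(K)
    columns = list(itertools.product(range(q), repeat=L))
    T = []
    for a in columns:
        # Bonds between spins inside column a (separations 1 .. L-1).
        intra = sum(K[p2 - p1 - 1] * (a[p1] == a[p2])
                    for p1 in range(L) for p2 in range(p1 + 1, L))
        row = []
        for b in columns:
            # Bonds from spin p1 of column a to spin p2 of the next column b.
            # Their separation L + p2 - p1 must not exceed L, hence p2 <= p1.
            inter = sum(K[L + p2 - p1 - 1] * (a[p1] == b[p2])
                        for p1 in range(L) for p2 in range(p1 + 1))
            row.append(math.exp(intra + inter))
        T.append(row)
    return T

def trace_of_power(T, M):
    """Tr(T^M) by repeated plain-Python matrix multiplication."""
    n = len(T)
    P = T
    for _ in range(M - 1):
        P = [[sum(P[i][k] * T[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(P[i][i] for i in range(n))

def brute_force_Z(q, K, N):
    """Direct sum over all q**N states with bonds (n, n + l), l = 1 .. L."""
    L = len(K)
    Z = 0.0
    for conf in itertools.product(range(q), repeat=N):
        e = sum(K[l - 1] * (conf[n] == conf[(n + l) % N])
                for n in range(N) for l in range(1, L + 1))
        Z += math.exp(e)
    return Z

q, K, M = 2, [0.5, 0.3, 0.2], 3  # illustrative values; L = 3, N = 9
L = len(K)
T = column_transfer_matrix(q, K)
Z_tm = trace_of_power(T, M)
Z_bf = brute_force_Z(q, K, M * L)
print(len(T), Z_tm, Z_bf)
```

The matrix size grows as $q^L$: for $q = 3$ and $L = 11$ it is already $177147 \times 177147$, which is why the structure of the matrices matters in practice.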

Two modifications are made to apply the Glumac-Uzelac method of constructing
transfer matrices to a protein system. Fig. 2(a) illustrates a segment of an
anti-parallel $\beta$-sheet, where interactions can occur between residues that are
remote in relative chain position but nearby in space. This is what is meant by
'long-range' in protein systems. Thus, the long-range nature of a protein system
comes from labeling the residues according to the sequence order and does not come
from the spatial distance between two residues. Even with this difference in the
definition of long-rangeness, the Glumac-Uzelac method can be used in solving the
protein problem. First, the strengths of interactions between each residue-residue
pair are similar and do not depend on the relative chain position $l$. This is a
main difference between our Hamiltonian (see Eq. (5) below) and Eq. (3). For
simplicity, in this article we shall take all contacts between $\beta$-strands to
be of the same strength. In making this modification, Eq. (3) is recast as

$$ -\beta H = \beta K \sum_{l=1}^{L} \sum_{n=1}^{N} \delta(i_n, i_{n+l}), $$

which drops the $l$-dependence of $K$ but maintains the long-range nature of the
Kronecker interactions.

FIG. 2: (Color online) (a) A segment of an $L = 11$ anti-parallel $\beta$-sheet
chain. The sequence position of each residue is labeled, and H-bonds are indicated
by the dashed (red) lines. (b) A simple pattern illustrating repeating $L = 3$ and
nearest-neighbor contact interactions, denoted by dashed (red) lines. The solid
(black) lines represent peptide bonds. (c) A diagram representing the partition
function for the structure in (b). The first column in (c) contains residues
$i_1, i_2, i_3$, the second column contains residues $i_4, i_5, i_6$, etc. Contacts
are represented by dashed lines. The residues comprising the columns alternate in
color from white to black, which corresponds to the residue pattern in (b). Repeated
multiplication of matrices $U$ and $V$ generates the partition function for the
whole chain.

Secondly, according to Fig. 2(a), two hydrogen bonds form between residue-residue
pairs, which occur for every other residue along a strand terminating at the turn.
On the other hand, the residues along the $\beta$-strand that are not involved in
hydrogen bonds with the opposite $\beta$-strand could be involved in hydrophobic
interactions with the opposite strand. To simplify the model, we assume that every
residue-residue pair along neighboring strands forms contacts of the same strength,
as stated above. The following pattern then represents H-bonding or hydrophobic
interactions between two residues along neighboring strands, which we identify as
contacts: $i_1 \to i_{1+L}$, $i_2 \to i_{1+L-1}$, $\ldots$,
$i_{(L+1)/2} \to i_{(L+1)/2+1}$. In the present work, the turn conformation is also
counted as a sheet conformation, but, in principle, the model can be extended to
include turn conformations specifically if $q > 3$. The Kronecker deltas given in
Eq. (3) are then
modified to represent the aforementioned sheet pattern. Additionally, for protein
systems where the neighboring strands have the same interaction length $L$, the
number of strands $M$ and the total number of residues $N$ are related by

$$ N = MR, \qquad R = (L+1)/2. \qquad (4) $$

We can write the two-state effective Hamiltonian for a pattern such as the one in
Fig. 2(b), extended to any $L$ and taking into account the two modifications made to
Eq. (3), as

where we denote $i_n = 2$ (coil) or $3$ (sheet), and
$b(i_k, m) = \delta(3, i_{1-k+R(m+1)})$ only allows $J_3$ terms to accumulate when
the residues at positions $k + R(m-1)$ and $1 - k + R(m+1)$ are locked in a sheet
conformation and are in contact. The term $J_3 > 0$ now represents contacts between
sheet residues, and $h_3 < 0$ is the reduced entropic cost for coil-to-sheet
conversions. The subscript 'sc' in $-\beta H_{sc}$ refers to 'sheet-coil', and the
subscript '3' in $h_3$ and $J_3$ refers to sheet. Unlike in Eq. (2), we do not
require all residues between two residues in contact to be locked into the sheet
state.

To see the general pattern described by the second term in Eq. (5), we start by
considering the simplest $L = 3$ case, as shown in Fig. 2(b). In reality, the
minimal structure in Fig. 2(b) may not even be considered a sheet structure, but it
nevertheless illustrates the general behavior that the transfer matrix can be
decomposed into a product of sub-transfer matrices. For the $L = 3$ case, the
transfer matrix decomposes into a product of two matrices $U$ and $V$, as
illustrated by Fig. 2(b) and (c). $U$ and $V$ are required to write out a general
sequence of $M$ strands and are explicitly written with the help of Fig. 2(c) as

where $|i\rangle$ and $|j\rangle$ are neighboring column vectors of length $L$; for
example, in Fig. 2(c), they can be $\langle i| = \langle i_1 i_2 i_3|$ and
$|j\rangle = |i_4 i_5 i_6\rangle$, and $x = \exp(\beta J_3)$. Each transfer matrix
$U$ and $V$ has dimension $q^L \times q^L$. This methodology works for any finite
$L$; the total number of transfer matrices needed to generate a periodic pattern for
general $L$ is found to be equal to the total number of interactions over the
distance $L + 1$, which happens to equal the number $R$ [18]. For example, for the
$L = 3$ case illustrated in Fig. 2(c), there are two interactions, a
nearest-neighbor one (for example, $i_2, i_3$ in Fig. 2(c)) and one over the longest
range of interaction (for example, $i_1, i_4$ in Fig. 2(c)); thus two matrices are
sufficient. For illustrative purposes, we explicitly consider a simple model of
anti-parallel sheet-helix-coil systems, which starts with a three-state ($q = 3$)
effective Hamiltonian with four parameters that can describe transitions between
sheet, helix, and coil structures. Helical conformations are assumed to form
contacts between nearest neighbors only, that is, the $k = 1$ case of Eq. (2). The
total effective Hamiltonian can be written as



$$ -\beta H_{hcs} = -\beta H_{hc} - \beta H_{sc}, \qquad (7) $$

where now $i_n = 1$, $2$, or $3$ refers to helix, coil, and sheet, respectively, and
the subscript 'hcs' in $-\beta H_{hcs}$ refers to 'helix-coil-sheet'. The partition
function can be written in the form of Eq. (1) when periodic boundary conditions are
imposed, and calculated using transfer matrices, similar to the $L = 3$ case
illustrated in Fig. 2(b) and (c).
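
The contact pattern described above, in which the residue at position $k + R(m-1)$ pairs with the residue at position $1 - k + R(m+1)$, can be enumerated directly. The sketch below assumes an open chain of $M$ strands, with $m = 1, \ldots, M-1$ labeling neighboring strand pairs and $k = 1, \ldots, R$; these index ranges and the function name `strand_contacts` are assumptions made for illustration, since the text does not state the summation limits.

```python
def strand_contacts(L, M):
    """List 1-based contact pairs (k + R(m-1), 1 - k + R(m+1)).

    Assumed ranges: m = 1 .. M-1 (neighboring strand pairs), k = 1 .. R.
    """
    if L % 2 == 0:
        raise ValueError("L must be odd so that R = (L + 1) / 2 is an integer")
    R = (L + 1) // 2
    pairs = []
    for m in range(1, M):
        for k in range(1, R + 1):
            pairs.append((k + R * (m - 1), 1 - k + R * (m + 1)))
    return pairs

# L = 3 with three strands (N = M * R = 6 residues), as in the Fig. 2(b) pattern.
print(strand_contacts(3, 3))  # -> [(1, 4), (2, 3), (3, 6), (4, 5)]

# Sequence separations for one strand pair of the L = 11 sheet of Fig. 2(a).
print([j - i for i, j in strand_contacts(11, 2)])  # -> [11, 9, 7, 5, 3, 1]
```

For $L = 3$ the list reproduces the two kinds of interaction named in the text: the longest-range contact $i_1, i_4$ and the nearest-neighbor contact $i_2, i_3$. Each strand pair contributes $R$ contacts, with sequence separations $L, L-2, \ldots, 1$.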




[Figure 3: three panels plotted against temperature $T$ (K). The legend of panel (b)
distinguishes the curves for $L = 3$, $5$, $7$, $9$, and $11$.]

FIG. 3: (Color online) All quantities are calculated using $J_1 = 2.85$ kcal/mol,
$J_3 = 2.45$ kcal/mol, $h_1 = -4.91$, and $h_3 = -4.20$. (a) Order parameters for the
case $L = 11$ with $M = 100$. (b) Heat capacity (kcal/mol/K) vs. $T$ for various
strand lengths $L$ with $M = 100$. (c) The same plot as in (b) with more details of
the helix-sheet transition given. Black dots denote transition temperatures, which
increase with the range parameter $L$.



The parameters $h_i$ and $J_i$ are chosen so that the helix state is the most stable
conformation at the lowest temperature in the temperature range of interest. The
coil dominates at high temperatures, where contact energies become relatively weak
compared to thermal fluctuations. The sheet is thus an intermediary state [19].
For some proteins, the sheet is seen as the most stable conformation at low
temperature, and the helix conformation becomes an intermediary state [20]. Our
model can accommodate this case as well as a variety of others with proper choices
of parameters.

For systems with fixed numbers of residues, the partition function facilitates
calculation of numerous thermodynamic quantities, such as the average energy,
$\langle E \rangle$, the heat capacity, $C$, and the order parameters, $\theta_i$,
which are the average fractional content of the $i$th state among the $q$
conformations at a particular temperature. To calculate the partition function, we
choose a multi-stranded $\beta$-barrel system, which serves as an example of a
protein system satisfying periodic boundary conditions. Inserting Eq. (9) into
Eq. (1) and differentiating, we have for such a system

respectively. In Fig. 3(a), the order parameters for helix, coil, and sheet are
presented for the case $L = 11$, $M = 100$, and in Fig. 3(b) and (c), we plot the
temperature dependence of the heat capacity for various $L$ cases. The heat capacity
curves show two peaks: the sharp, low-temperature peak signifies the helix-sheet
transition, and the broad, high-temperature peak signifies the sheet-coil
transition. These peak positions are approximately given by the crossing points of
$\theta_i$, shown in Fig. 3(a), between the helix and sheet curves and between the
sheet and coil curves.
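
Quantities such as the average energy and the heat capacity follow from derivatives of $\ln Z$. As a hedged illustration, the sketch below evaluates the textbook relations $\langle E \rangle = -\partial \ln Z / \partial \beta$ and $C = k_B \beta^2\, \partial^2 \ln Z / \partial \beta^2$ by finite differences. It uses a toy two-level system with a known closed form rather than the helix-coil-sheet model, and the energy gap, step sizes, and function names are assumptions. Whether these are the exact expressions used by the authors is not confirmed by the surviving text.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def log_Z_two_level(beta, gap):
    """ln Z for a toy system with energy levels 0 and `gap` (kcal/mol)."""
    return math.log(1.0 + math.exp(-beta * gap))

def mean_energy(log_Z, beta, d=1e-4):
    """<E> = -d ln Z / d beta, by a central difference."""
    return -(log_Z(beta + d) - log_Z(beta - d)) / (2.0 * d)

def heat_capacity(log_Z, beta, d=1e-3):
    """C = k_B * beta^2 * d^2 ln Z / d beta^2, by a second central difference."""
    second = (log_Z(beta + d) - 2.0 * log_Z(beta) + log_Z(beta - d)) / d ** 2
    return K_B * beta ** 2 * second

gap = 1.0                     # illustrative gap in kcal/mol
beta = 1.0 / (K_B * 300.0)    # inverse temperature at 300 K

def log_Z(b):
    return log_Z_two_level(b, gap)

E_num = mean_energy(log_Z, beta)
C_num = heat_capacity(log_Z, beta)

# Closed-form results for the two-level system, used only as a reference.
w = math.exp(beta * gap)
E_exact = gap / (w + 1.0)
C_exact = K_B * (beta * gap) ** 2 * w / (w + 1.0) ** 2
print(E_num, E_exact)
print(C_num, C_exact)
```

The same two functions can be applied to any routine that returns $\ln Z$ as a function of $\beta$, for instance one based on a transfer-matrix trace; scanning the temperature then locates heat-capacity peaks of the kind discussed above.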

In conclusion, we have shown that, for a simple pattern associated with
anti-parallel $\beta$-sheet structures, an effective Hamiltonian using a minimal
number of parameters and its corresponding partition function can be constructed to
study its helix-coil-sheet transitions. The partition function can be exactly
computed by means of transfer matrices, which are used to calculate thermodynamic
properties of the system, including the order parameters for helices and sheets and
the heat capacity, which show that increasing strand length, $L$, plays a
stabilizing role in the protein.